Modulation of human extrastriate visual processing by audio-visual attention 
1. Introduction

Orienting attention within a specific sensory modality
is known to enhance activity in modality-specific
and/or stimulus-feature-specific regions of the human
brain, as shown by hemodynamic functional brain imaging
methods (fMRI, PET) [1-4] and electrophysiological
methods (EEG, MEG) [5-8]. In particular, selective
attention to specific stimulus features such as color,
form, movement and faces in the visual modality
selectively modulates the regional cerebral blood flow
(rCBF) in the regions of extrastriate visual cortex
specialized for processing the corresponding stimulus
features [4]. Similar results for spatial attention
were also found using PET and event-related brain
potentials (ERPs), in which attention-related
enhancement of neural activity was observed in the
contralateral extrastriate cortex [8]. The present
study focused on the effects of intermodal orientation
of attention on information processing in the human
visual cortical areas, and asked whether intermodal
attention modulates neural activity in the
stimulus-feature-specific areas of the human
extrastriate visual cortex at specific latencies. We
used a whole-cortex-type SQUID (superconducting quantum
interference device) system to record neuromagnetic
brain responses with a temporal resolution of a few
milliseconds, and an array signal processing technique
that combines linear synthetic filtering with
inter-subject statistical analyses to detect the
attentional modulation of neural activity.

2. Methods

We measured brain magnetic fields while nine healthy
subjects performed (1) an audio-visual spatial
discrimination task and (2) an audio-visual
color/frequency discrimination task under three
different intermodal attentional conditions.

In the spatial discrimination task, an intermixed
sequence of auditory and visual stimuli was presented
in randomized order with equal probability of
occurrence. The visual stimuli were presented by red
light-emitting diodes (LEDs) located in the left or
right visual field at a visual angle of 5.7 degrees
from a fixation point, and the auditory stimuli were
1000 Hz tone bursts (200 ms duration, 5 ms rise/fall
times) presented to either the left or the right ear.
The subjects were required to discern which visual
field or ear was stimulated, and to respond as quickly
as possible by lifting the right index or middle
finger, under three audio-visual attentional
conditions: (a) attention to vision (respond only to
visual stimuli and ignore auditory stimuli), (b)
attention to audition (respond only to auditory stimuli
and ignore visual stimuli), and (c) divided attention
to both modalities (respond to both visual and auditory
stimuli), which served as the neutral condition.

Similar stimulus sequences were presented in the
color/frequency discrimination task, but the visual
stimuli consisted of red or green flashes of an LED
located in the left visual field at a visual angle of
5.7 degrees from a fixation point, and the auditory
stimuli consisted of 1000 Hz or 2000 Hz tone bursts
presented randomly to the subject’s left ear. The
subjects were required to discriminate the color of the
LED flash or the pitch of the tone burst (low or high)
under the same three intermodal attentional conditions
as in the spatial task.

In both the spatial and color experiments, MEG signals
were measured separately for each attentional condition
using a 122-channel whole-cortex-type
neuromagnetometer. Stimulus-related epochs of 700 ms,
including a 200 ms pre-stimulus baseline, were recorded
with a pass-band of 0.03-100 Hz and a sampling rate of
498 Hz. More than 50 epochs were averaged for each
visual task condition. Epochs with an MEG signal change
exceeding 1500 fT/cm were excluded from averaging to
avoid contamination by large external noise.
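
The rejection-and-averaging step described above can be
sketched as follows. This is a minimal illustration, not
the acquisition software used in the study: it assumes
that "signal change" means the peak-to-peak amplitude of
any channel within an epoch, and the function names and
the nested-list data layout are hypothetical.

```python
def peak_to_peak(samples):
    """Largest signal change within one channel of one epoch."""
    return max(samples) - min(samples)


def average_accepted_epochs(epochs, threshold=1500.0):
    """Average the epochs whose channels all stay within the threshold.

    epochs: list of epochs; each epoch is a list of channels and each
    channel is a list of samples in fT/cm (hypothetical layout).
    Returns (averaged_epoch, number_of_accepted_epochs).
    """
    # Assumption: an epoch is rejected if ANY channel exceeds the threshold.
    accepted = [
        epoch for epoch in epochs
        if all(peak_to_peak(channel) <= threshold for channel in epoch)
    ]
    if not accepted:
        raise ValueError("all epochs were rejected")
    n_channels = len(accepted[0])
    n_samples = len(accepted[0][0])
    # Sample-by-sample mean across the accepted epochs, per channel.
    average = [
        [sum(epoch[ch][s] for epoch in accepted) / len(accepted)
         for s in range(n_samples)]
        for ch in range(n_channels)
    ]
    return average, len(accepted)
```

With the recording parameters given above, each epoch
would hold roughly 349 samples per channel (700 ms at
498 Hz), of which about 100 fall in the pre-stimulus
baseline.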

For each subject, the distributions of neural activity
during the spatial and color discrimination tasks under
the three attentional conditions were estimated within
a latency range of 100 to 500 ms after the visual task
stimulus onset, at 10 ms intervals, using the linearly
constrained minimum variance (LCMV) synthetic filtering
algorithm [9][10]. The synthetic filters are
implemented as a weighted sum of the data recorded at
the MEG sensor array. The weights are determined so as
to minimize the filter output power subject to a linear
constraint, which forces the filter to pass the
specific distribution of the measured signal generated
by a brain electrical source at a specified location,
while the minimization of power attenuates activity
originating at other locations. In calculating the
functional activation map with an LCMV synthetic
filter, the filter output power as a function of
location is normalized by the estimated noise power to
compensate for the dominance of noise at locations
where the sensitivity of the sensor array is low; the
normalized quantity is referred to as the neural
activity index (NAI) [10]. In this study, the noise
power distribution was estimated from the 200 ms
pre-stimulus period. The synthetic filter output was
evaluated at 1190 points on a uniform grid with 10 mm
spacing in each direction within the spherical head
model. The functional activation maps for the three
attentional conditions and nine subjects, obtained by
linear synthetic filtering at each grid point at 10 ms
intervals, were submitted to voxel-by-voxel statistical
tests in the latency range of 100-350 ms. Analysis of
variance (ANOVA) with Dunnett’s post-hoc test was used
to detect significant differences in neural activation
between the three attentional conditions.
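
As an illustration of the LCMV filter and the NAI
described above, the sketch below handles the simplified
case of a source with fixed orientation, so that the lead
field at each location is a single vector l. Under that
assumption the weights are w = C^-1 l / (l^T C^-1 l), the
filter output power is 1 / (l^T C^-1 l), and the NAI is
the ratio of that power to the same expression evaluated
with the noise covariance Q. The function names, the
small pure-Python linear solver, and the three-sensor toy
data are assumptions made for illustration; they are not
taken from the study, which used a 122-channel array.

```python
def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the caller's matrix is left untouched.
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def lcmv_weights(leadfield, data_cov):
    """Minimum-variance weights with unit gain for the given lead field."""
    c_inv_l = solve(data_cov, leadfield)   # C^-1 l
    gain = dot(leadfield, c_inv_l)         # l^T C^-1 l
    return [value / gain for value in c_inv_l]


def neural_activity_index(leadfield, data_cov, noise_cov):
    """Filter output power divided by the corresponding noise power."""
    signal_power = 1.0 / dot(leadfield, solve(data_cov, leadfield))
    noise_power = 1.0 / dot(leadfield, solve(noise_cov, leadfield))
    return signal_power / noise_power


# Toy data (hypothetical): white sensor noise plus one active source at l_on.
noise_cov = [[0.1, 0.0, 0.0], [0.0, 0.1, 0.0], [0.0, 0.0, 0.1]]
l_on = [1.0, 0.5, 0.0]    # lead field of the active location
l_off = [0.0, 0.5, 1.0]   # lead field of a silent location
data_cov = [[noise_cov[i][j] + 2.0 * l_on[i] * l_on[j] for j in range(3)]
            for i in range(3)]
print(neural_activity_index(l_on, data_cov, noise_cov))
print(neural_activity_index(l_off, data_cov, noise_cov))
```

In this toy example the index is far larger at the active
location than at the silent one, where it stays close to
1. Computing the index at every grid point and latency
would give one activation map per condition.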

3. Results 

For all subjects, the error rates in task performance
were below 5% in both the spatial and non-spatial
discrimination tasks. Figure 1 shows the distributions
of estimated neural activity for a typical subject
during the spatial (Figure 1A) and color (Figure 1B)
discrimination tasks under the three attentional
conditions, estimated at 200 and 230 ms, respectively,
after the onset of a visual stimulus in the left visual
field. The distributions of neural activity are
superimposed on MR images of the subject’s head. The
size of the dots in each MRI slice indicates the output
of the synthetic neural current reconstruction filter,
reflecting the estimated intensity of the neural
activity. In the spatial task, differences in neural
activity between the three attentional conditions were
clearly observed in the contralateral occipito-temporal
region (Brodmann’s area (BA) 18), whereas in the color
task the differences were observed in the posterior
portion of the contralateral infero-temporal region
(BA 19, 37). No clear activation was found in the
“attention to audition” condition in either the spatial
or the color task.

Figure 1: Estimated neural activity distribution for three attentional conditions at about 200 ms after the
visual stimulus to the left visual field in the (A) spatial discrimination task and (B) color
discrimination task, superimposed on the MR images of the subject’s head.

Results of the inter-subject statistical analysis at
the contralateral (i) posterior portion of the
infero-temporal region, (ii) occipito-temporal region,
and (iii) occipital region are shown in Figure 2. Each
graph shows the time course of the estimated neural
activity at each point, averaged over the nine
subjects, for the three attentional conditions (thick
solid lines: “attention to vision”; thin solid lines:
“attention to both”; broken lines: “attention to
audition”). White and black triangles mark the
latencies at which the ANOVA showed statistically
significant differences between the neutral condition
and the condition of selective attention to the visual
or auditory modality (white triangles: p < 0.05; black
triangles: p < 0.01). In both the spatial and color
discrimination tasks, neural activity decreased in all
regions under consideration in the “attention to
audition” condition. In the spatial discrimination
task, clear enhancement of neural activity under the
“attention to vision” condition was observed in the
occipito-temporal region in the latency range between
190 and 260 ms after the visual stimulus onset, whereas
no enhancement was observed in the posterior
infero-temporal region. In the color discrimination
task, enhanced activation under the “attention to
vision” condition was found in the posterior
infero-temporal region in the latency range between 190
and 340 ms, whereas no enhancement was observed in the
occipito-temporal region.



Figure 2: Inter-subject analysis of the neural activity index (NAI) corresponding to the visual stimuli during
the (A) spatial task and (B) non-spatial task, estimated in the (i) infero-temporal region, (ii)
occipito-temporal region, and (iii) occipital region. Each line shows the grand average of the estimated
NAI for 9 subjects. The triangles show the latencies of significant changes between the
auditory/visual attention condition and the neutral condition.
Legend: ▼ NAI(vision) > NAI(neutral), p < 0.01; ▽ NAI(vision) > NAI(neutral), p < 0.05;
△ NAI(audition) < NAI(neutral), p < 0.05.














4. Discussion 

These results indicate that audio-visual intermodal
orientation of attention modulates human visual
information processing in the extrastriate visual
regions in a stimulus-feature-specific manner. The
brain region showing a clear intermodal attentional
effect in the spatial task largely overlapped with the
region where clear attention-related modulation was
observed in previous studies of selective attention to
location within the visual modality. In the color task,
on the other hand, although activation was observed in
the occipito-temporal region including V4, attention to
vision did not modulate neural activity there, where
clear stimulus-feature-specific attentional effects
within the visual modality have been reported, but
instead in the anterior portion of the fusiform gyrus.
Recent fMRI and ERP studies have suggested that, in
addition to V4, the posterior portion of the inferior
temporal cortex in the human brain, including the
fusiform gyrus, is deeply involved in higher-order
color processing [7][11]. The current results suggest
that intermodal orientation of attention to vision
modulates human higher-order color processing in the
anterior fusiform gyrus.

In addition, the present results elucidate the temporal
characteristics of the neural activity in these
regions. In the spatial discrimination task, the
attentional enhancement of neural activity in the
occipito-temporal region was observed in the latency
range between 190 and 260 ms after the visual stimulus
onset, whereas in the color discrimination task the
enhancement in the posterior infero-temporal region
continued until 340 ms after the stimulus onset. These
differences might reflect a temporal dissociation
between location and color processing in the human
extrastriate visual regions.

Attention-related depression of neural activity was
observed in all areas from the occipital to the
posterior infero-temporal region when the subjects’
attention was directed to audition during both the
spatial and color discrimination tasks. These results
agree with previous PET studies that reported
intermodal deactivation in cortical regions not
involved in performing a specific task [4][12].